Methods and systems for media annotation, selection and display of additional information associated with a region of interest in video content

ABSTRACT

Methods, systems, and processor-readable media for selecting a region within a particular frame of video content to access additional information about an area of interest associated with the region within the particular frame, and displaying the additional information, in response to selecting the region associated with the particular frame of video content to access the additional information about the area of interest associated with the region within the particular frame. A selection packet can be generated, which includes frame selection data associated with the particular frame of video content. The frame selection data can include data that is sufficient to identify the particular frame of video content.

CROSS REFERENCE TO RELATED APPLICATIONS

This patent application is a continuation of U.S. patent application Ser. No. 12/976,148, entitled “Flick Intel Annotation Methods and Systems,” which was filed on Dec. 22, 2010 and which is incorporated herein by reference in its entirety. U.S. patent application Ser. No. 12/976,148 in turn claims the priority and benefit of U.S. provisional patent application 61/291,837, entitled “Systems and Methods for obtaining background data associated with a movie, show, or live sporting event”, filed on Dec. 31, 2009 and of U.S. Provisional Patent Application No. 61/419,268, filed Dec. 3, 2010, entitled “Flick Intel Annotation Systems and Webcast Infrastructure”. This patent application therefore claims priority to U.S. Provisional Patent Application Ser. No. 61/291,837 and U.S. Provisional Patent Application Ser. No. 61/419,268, which are incorporated herein by reference in their entireties.

TECHNICAL FIELD

Embodiments relate to video content, video displays, and video compositing. Embodiments also relate to computer systems, user input devices, databases, and computer networks.

BACKGROUND OF THE INVENTION

People have watched video content on televisions and other audio-visual devices for decades. They have also used gaming systems, personal computers, handheld devices, and other devices to enjoy interactive content. They often have questions about places, people, and things appearing as the video content is displayed, and about the music they hear. Databases containing information about the content, such as the actors in a scene or the music being played, already exist and provide users with the ability to learn more.

The existing database solutions provide information about elements appearing in a movie or scene, but only in a very general way. A person curious about a scene element can obtain information about the scene and hope that the information mentions the scene element in which the person is interested. Systems and methods that provide people with the ability to select a specific scene element and to obtain information about only that element are needed.

BRIEF SUMMARY

The following summary is provided to facilitate an understanding of some of the innovative features unique to the embodiments and is not intended to be a full description. A full appreciation of the various aspects of the embodiments can be gained by taking the entire specification, claims, drawings, and abstract as a whole.

It is therefore an aspect of the embodiments that a media device can provide video content to a display device and that a person can view the video content as it is presented on the display device. A series of scenes or a time varying series of frames along with any audio dialog, music, or sound effects are examples of video content.

It is another aspect of the embodiments that the person can choose a region on the display device. A region can be chosen with a pointing device or any other form of user input by which the person can indicate a spot on the display device and select that spot. Frame specification data can be generated when the person chooses the region. The frame specification data can identify a specific scene or frame within the video content.

It is yet another aspect of the embodiments to provide an element identifier based on the region and the frame specification data. Element identifiers are uniquely associated with scene elements. The element identifier can be obtained by querying an annotation database that relates element identifiers to regions and frame specification data. Note that the element identifier in some embodiments can be provided by a human worker who views the scene or frame, looks to the region, and reports what appears at that location.

A number of embodiments, preferred and alternative, are disclosed herein. For example, in an embodiment, a method can be implemented, which includes selecting a region within a particular frame of video content to access additional information about an area of interest associated with the region within the particular frame; and displaying the additional information, in response to selecting the region associated with the particular frame of video content to access the additional information about the area of interest associated with the region within the particular frame. In another embodiment, a step can be provided for generating a selection packet that includes frame selection data associated with the particular frame of video content. In yet another embodiment, the frame selection data can include data that is sufficient to identify the particular frame of video content.

In another embodiment, a step can be provided for detecting the region. In another embodiment, a step can be implemented for accessing the additional information from an annotated content stream. In still another embodiment, a step can be provided for storing the additional information in an annotation database. In another embodiment, the region within the particular frame of the video content can be a coordinate. In another embodiment, the region within the particular frame of the video content can be a plurality of coordinates. In still another embodiment, the region within the particular frame of the video content can include data indicative of a particular region.

In another embodiment, a system can be implemented, which includes, for example, a processor and a computer-usable medium embodying computer program code. Such computer program code can include instructions executable by the processor and configured for selecting a region within a particular frame of video content to access additional information about an area of interest associated with the region within the particular frame; and displaying the additional information, in response to selecting the region associated with the particular frame of video content to access the additional information about the area of interest associated with the region within the particular frame.

In another embodiment, such instructions can be configured for generating a selection packet that includes frame selection data associated with the particular frame of video content. In still another embodiment, the aforementioned frame selection data can include data that is sufficient to identify the particular frame of video content. In still other embodiments, such instructions can be configured for detecting the region. In yet other embodiments, such instructions can be configured for accessing the additional information from an annotated content stream. In still other embodiments, such instructions can be further configured for storing (initially or at other times) the additional information in an annotation database.

In yet another embodiment, a processor-readable medium storing code representing instructions to cause a processor to perform a process can be provided. Such code can comprise code to, for example, select a region within a particular frame of video content to access additional information about an area of interest associated with the region within the particular frame; and display the additional information, in response to selecting the region associated with the particular frame of video content to access the additional information about the area of interest associated with the region within the particular frame. In still another embodiment, such code can further comprise code to generate a selection packet that includes frame selection data associated with the particular frame of video content. In yet another embodiment, the frame selection data can include data that is sufficient to identify the particular frame of video content. In still another embodiment, such code can include code to detect the region. In still another embodiment, such code can comprise code to access the additional information from an annotated content stream and store the additional information in an annotation database.

BRIEF DESCRIPTION OF THE DRAWINGS

The accompanying figures, in which like reference numerals refer to identical or functionally similar elements throughout the separate views and which are incorporated in and form a part of the specification, further illustrate aspects of the embodiments and, together with the background, brief summary, and detailed description serve to explain the principles of the embodiments.

FIG. 1 illustrates element data being presented on a second display in response to the selection of a scene element on a first display in accordance with aspects of certain embodiments;

FIG. 2 illustrates an annotation database providing element identifiers in response to a person selecting scene elements in accordance with aspects of the embodiments;

FIG. 3 illustrates an annotation service providing element identifiers in response to a person selecting scene elements in accordance with aspects of the embodiments; and

FIG. 4 illustrates an annotated content stream passing to a media device such that the media device produces element data in accordance with aspects of certain embodiments.

DETAILED DESCRIPTION

The particular values and configurations discussed in these non-limiting examples can be varied and are cited merely to illustrate at least one embodiment and are not intended to limit the scope thereof. In general, the figures are not to scale.

The embodiments will now be described more fully hereinafter with reference to the accompanying drawings, in which illustrative embodiments of the invention are shown. The embodiments disclosed herein can be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. Like numbers refer to like elements throughout. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items.

Video content is a time varying presentation of scenes or video frames. Each frame can contain a number of scene elements such as actors, foreground items, background items, or other items. A person enjoying video content can select a scene element by specifying a screen region (e.g., a coordinate, group of coordinates, particular area, etc.) while the video content plays. Frame specification data identifies the specific frame or scene being displayed when the region is selected. The region in combination with the frame specification data is sufficient to identify the scene element that the person has chosen. Information about the scene element can then be presented to the person. An annotation database can associate scene elements with frame specification data and regions.

FIG. 1 illustrates element data being presented on a second display 119 in response to the selection of a scene element on a display 101 in accordance with aspects of certain embodiments. A media device 104 passes video content to the display 101 to be viewed by a person. The person can manipulate a selection device 112 to choose a region or coordinate(s) 102 (e.g., data indicative of a region, a coordinate, groups of coordinates, etc.) on the display 101. The data indicative of the region or coordinate(s) 102 can then be passed to the media device 104. In some embodiments, the selection device 112 can detect the region or coordinate(s) 102. For example, the selection device 112 can detect the locations of emitters 106 and infer the screen position being pointed at from those emitter locations. In other embodiments, the display 101 can detect the region or coordinate(s) 102. For example, the selection device 112 can emit a light beam that the display 101 detects. Other common coordinate selection means include mice, trackballs, and touch sensors. More advanced pointing means can observe the person's body or eyeballs to thereby determine a coordinate. Clicking a button or some other action can generate an event indicating that a scene element is chosen.

The media device 104 can generate a selection packet 107 that includes frame selection data and the region or coordinate(s) 102. The frame selection data is data that is sufficient to identify a specific frame or scene. For example, the frame selection data can be a media tag 108 and a timestamp 109. The media tag 108 can identify a particular movie, show, sporting event, advertisement, video clip, scene or other unit of video content. A timestamp 109 specifies a time within the video content. In combination, a media tag and timestamp can specify a particular frame from amongst all the frames of video content that have ever been produced.
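By way of a non-limiting illustration, a selection packet of the kind described above can be sketched in Python as follows. The class name, field names, the use of seconds for the timestamp, and the use of normalized screen coordinates are assumptions made only for this example and are not prescribed by the embodiments.

```python
import json
from dataclasses import asdict, dataclass
from typing import Tuple


@dataclass(frozen=True)
class SelectionPacket:
    """Hypothetical selection packet: frame selection data plus a region."""
    media_tag: str                            # identifies the unit of video content
    timestamp_s: float                        # time within the content (seconds, assumed)
    region: Tuple[Tuple[float, float], ...]   # one or more (x, y) points, 0..1 (assumed)


def make_selection_packet(media_tag, timestamp_s, *coords):
    """Build a packet from a media tag, a timestamp, and at least one coordinate."""
    if not coords:
        raise ValueError("at least one coordinate is required")
    region = tuple((float(x), float(y)) for x, y in coords)
    return SelectionPacket(media_tag, float(timestamp_s), region)


def to_json(packet):
    """Serialize the packet so that it can be formed into a query."""
    return json.dumps(asdict(packet), sort_keys=True)


# The media tag below is an invented placeholder.
packet = make_selection_packet("movie:example-title", 754.2, (0.41, 0.63))
print(to_json(packet))
```

In this sketch a single coordinate and a plurality of coordinates share one representation, so the same packet can carry either form of region.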

The selection packet 107 can be formed into a query for an annotation database 111. The annotation database 111 can associate element identifiers with frame selection data and regions (e.g., data indicative of a particular region or groups of regions, a coordinate, groups of coordinates, etc.). As such, the annotation database 111 can produce an element identifier 113 in response to the query. The element identifier 113 can identify a person 114, an item 115, music 116, a place 117, or something else.
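One possible way to resolve such a query is sketched below in Python. The in-memory list, the rectangular region representation, and all names and values are hypothetical and serve only to illustrate the lookup; an actual annotation database can be organized differently.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Annotation:
    """Hypothetical record relating an element identifier to a frame range and region."""
    media_tag: str
    t_start: float
    t_end: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    element_id: str


def query_element(db: Iterable[Annotation], media_tag: str,
                  timestamp: float, x: float, y: float) -> Optional[str]:
    """Return the first element identifier whose time range and region contain the selection."""
    for a in db:
        if (a.media_tag == media_tag
                and a.t_start <= timestamp <= a.t_end
                and a.x_min <= x <= a.x_max
                and a.y_min <= y <= a.y_max):
            return a.element_id
    return None


# Invented sample annotations for a single unit of video content.
ANNOTATIONS = [
    Annotation("movie:example-title", 600.0, 645.0, 0.10, 0.55, 0.35, 0.80, "item:car-001"),
    Annotation("movie:example-title", 600.0, 620.0, 0.50, 0.20, 0.70, 0.90, "person:actor-001"),
]

hit = query_element(ANNOTATIONS, "movie:example-title", 610.0, 0.60, 0.50)
miss = query_element(ANNOTATIONS, "movie:example-title", 700.0, 0.60, 0.50)
print(hit, miss)
```

A linear scan is used here only for brevity; an index over media tags and time ranges would be a natural refinement for a large database.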

The element identifier 113 can then be passed to another server 118 that responds by producing element data for presentation to the person. Examples of element data include, but are not limited to: statistics on a person such as an athlete; a picture of a person, object or place; an offer for purchase of an item, service, or song; and links to other media in which a person, item, or place appears.
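As a minimal sketch, the mapping from an element identifier to element data can be modeled in Python as a keyed lookup. The identifiers, field names, statistics, and the example.com address below are all invented for illustration and do not describe any particular server.

```python
from typing import Any, Dict

# Hypothetical element data keyed by element identifier; every value is invented.
ELEMENT_DATA: Dict[str, Dict[str, Any]] = {
    "person:athlete-001": {
        "kind": "person",
        "statistics": {"games_played": 82},
    },
    "item:car-001": {
        "kind": "item",
        "purchase_offer_url": "https://example.com/offers/car-001",
    },
}


def get_element_data(element_id: str) -> Dict[str, Any]:
    """Return element data for an identifier, or a record marking it as not found."""
    data = ELEMENT_DATA.get(element_id)
    if data is None:
        return {"element_id": element_id, "found": False}
    return {"element_id": element_id, "found": True, **data}


print(get_element_data("item:car-001"))
print(get_element_data("place:unknown"))
```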

FIG. 2 illustrates an annotation database 111 providing element identifiers 211 in response to a person selecting scene elements in accordance with aspects of the embodiments. An annotation service/module 202 can produce annotated content 203 by annotating content 201. An annotation module is a device, algorithm, program, or other means that automatically annotates content. Image recognition algorithms can locate items within scenes and frames and thereby automatically provide annotation data. An annotation service is a service provider that annotates content. An annotation service provider can employ both human workers and annotation modules.

Annotation is a process wherein scene elements, each having an element identifier, are associated with media tags and space time ranges. A space time range identifies a range of times and positions at which a scene element appears. For example, a car can sit unmoving during an entire scene. The element identifier can specify the make, model, color, and trim level of the car, the media tag can identify a movie containing the scene, and the space time range can specify the time range of the movie scene and the location of the car within the scene.
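The car example above can be sketched in Python as follows. Modeling the space time range as a time interval combined with an axis-aligned rectangle in normalized coordinates is an assumption made for this illustration, as are all names and values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpaceTimeRange:
    """Hypothetical space time range: a time interval plus a screen rectangle."""
    t_start: float
    t_end: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, t: float, x: float, y: float) -> bool:
        """True when the element appears at time t at screen position (x, y)."""
        return (self.t_start <= t <= self.t_end
                and self.x_min <= x <= self.x_max
                and self.y_min <= y <= self.y_max)


@dataclass(frozen=True)
class AnnotationEntry:
    """Associates a scene element with a media tag and a space time range."""
    element_id: str
    media_tag: str
    where_when: SpaceTimeRange


# A car that sits unmoving during an entire scene (all values invented).
car = AnnotationEntry(
    element_id="car:example-make/example-model/red/base-trim",
    media_tag="movie:example-title",
    where_when=SpaceTimeRange(600.0, 645.0, 0.10, 0.55, 0.35, 0.80),
)

print(car.where_when.contains(620.0, 0.20, 0.60))
print(car.where_when.contains(700.0, 0.20, 0.60))
```

A scene element that moves could be described in this sketch by several entries sharing one element identifier, each covering a shorter time interval.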

The content 201 can be passed to a media device 104 that produces a media stream 207 for presentation on a display device 206. A person 205 watching the display device 206 can use a selection device 112 to select a region on the display device 206. A selection packet 107 containing the coordinate and some frame specification data can then be passed to the annotation database 111 which responds by identifying the scene element 211. An additional data server 118 can produce element data 212 for that identified scene element 211. The element data 212 can then be presented to the person.

FIG. 3 illustrates an annotation service providing element identifiers in response to a person selecting scene elements in accordance with aspects of the embodiments. The embodiment of FIG. 3 differs from that of FIG. 2 in that the content 201 is not necessarily annotated before being viewed by the person 205. The selection packet 107 is passed to the annotation service 301 where a human worker 302 or annotation module 303 determines what scene element the person 205 selected and creates a new annotation entry for incorporation into the annotation database 111.
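The on-demand flow of FIG. 3 can be sketched in Python as a lookup followed, on a miss, by a call to an annotation service whose result is stored as a new annotation entry. The callable standing in for the human worker or annotation module, the packet layout, and every name below are hypothetical.

```python
from typing import List, Optional, Tuple

Packet = Tuple[str, float, float, float]      # (media_tag, timestamp, x, y), assumed layout
Box = Tuple[float, float, float, float]       # (x_min, y_min, x_max, y_max)


class AnnotationDatabase:
    """Minimal in-memory stand-in for an annotation database."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, float, float, Box, str]] = []

    def add(self, media_tag, t_start, t_end, box, element_id) -> None:
        self._entries.append((media_tag, t_start, t_end, box, element_id))

    def lookup(self, packet: Packet) -> Optional[str]:
        media_tag, t, x, y = packet
        for tag, t0, t1, (x0, y0, x1, y1), element_id in self._entries:
            if tag == media_tag and t0 <= t <= t1 and x0 <= x <= x1 and y0 <= y <= y1:
                return element_id
        return None


def identify(packet, db, annotate):
    """Return an element identifier, creating a new annotation entry on a miss."""
    element_id = db.lookup(packet)
    if element_id is None:
        # annotate stands in for a human worker or an annotation module.
        t_start, t_end, box, element_id = annotate(packet)
        db.add(packet[0], t_start, t_end, box, element_id)
    return element_id


calls = []


def stub_annotator(packet):
    """Invented annotator: labels a small window around the selection."""
    calls.append(packet)
    _, t, x, y = packet
    return (t - 1.0, t + 1.0, (x - 0.05, y - 0.05, x + 0.05, y + 0.05), "item:lamp-001")


db = AnnotationDatabase()
packet = ("movie:example-title", 42.0, 0.30, 0.40)
first = identify(packet, db, stub_annotator)    # miss: the annotator is consulted
second = identify(packet, db, stub_annotator)   # hit: served from the database
print(first, second, len(calls))
```

Storing the new entry means that a later selection of the same scene element can be answered without consulting the annotation service again.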

FIG. 4 illustrates an annotated content stream 401 passing to a media device 104 such that the media device 104 produces element data 407 in accordance with aspects of certain embodiments. Annotated content, such as annotated content 203 of FIG. 2, can be passed as an annotated content stream 401 to the media device 104. The annotated content stream 401 can include a content stream 402, element stream 403, and element data 406. The media device 104 can then pass the content for presentation on the display 206 and store the element data 406 and the data in the element stream 403. The data in the element stream 403 can be formed into an annotation database with the possible exception that no media tag is needed. No media tag is needed because all the annotations refer only to the content stream 402. As such, the element stream 403 is illustrated as containing only space time ranges 404 and element identifiers 405.

The media device 104, having assembled an annotation database and having stored element data 406, can produce element data 407 for a scene element selected by a person 205 without querying remote databases or accessing remote resources.
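A minimal Python sketch of such a local arrangement follows. Because every annotation in the element stream refers to the same content stream, the records carry no media tag. The record layout, class names, and sample values are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass(frozen=True)
class ElementStreamRecord:
    """Hypothetical element stream record: a space time range and an element identifier."""
    t_start: float
    t_end: float
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    element_id: str


class LocalAnnotationStore:
    """Annotation database assembled on the media device from the element stream."""

    def __init__(self, element_stream: Iterable[ElementStreamRecord],
                 element_data: Dict[str, Dict[str, Any]]) -> None:
        self._records = list(element_stream)
        self._data = dict(element_data)

    def element_data_at(self, t: float, x: float, y: float) -> Optional[Dict[str, Any]]:
        """Resolve a selection to stored element data without any remote query."""
        for r in self._records:
            if (r.t_start <= t <= r.t_end
                    and r.x_min <= x <= r.x_max
                    and r.y_min <= y <= r.y_max):
                return self._data.get(r.element_id)
        return None


# Invented stream contents and element data.
stream = [ElementStreamRecord(10.0, 25.0, 0.40, 0.40, 0.60, 0.60, "music:song-001")]
data = {"music:song-001": {"title": "Example Song"}}

store = LocalAnnotationStore(stream, data)
print(store.element_data_at(12.0, 0.50, 0.50))
print(store.element_data_at(30.0, 0.50, 0.50))
```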

Note that in practice, the content stream 402, element stream 403, and element data 406 can be transferred separately or in combination as streaming data. Means for transferring content, annotations, and element data include TV signals and storage devices such as DVD disks or data disks. Furthermore, the element data 406 can be passed to the media device 104 or can be stored and accessed on a remote server.

It will be appreciated that variations of the above-disclosed and other features and functions, or alternatives thereof, may be desirably combined into many other different systems or applications. Also, various presently unforeseen or unanticipated alternatives, modifications, variations, or improvements therein may be subsequently made by those skilled in the art, which are also intended to be encompassed by the following claims.

The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.

As will be appreciated by one skilled in the art, the present invention can be embodied as a method, data processing system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects, all generally referred to herein as a “circuit” or “module.” Furthermore, the present invention may take the form of a computer program product on a computer-usable storage medium having computer-usable program code embodied in the medium. Any suitable computer readable medium may be utilized including hard disks, USB Flash Drives, DVDs, CD-ROMs, optical storage devices, magnetic storage devices, etc.

Computer program code for carrying out operations of the present invention may be written in an object oriented programming language (e.g., Java, C++, etc.). The computer program code, however, for carrying out operations of the present invention may also be written in conventional procedural programming languages such as the “C” programming language, in a visually oriented programming environment such as, for example, VisualBasic, or in functional programming languages such as LISP or Erlang.

The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer. In the latter scenario, the remote computer may be connected to the user's computer through a local area network (LAN), a wide area network (WAN), a wireless data network (e.g., WiFi, WiMAX, 802.xx), or a cellular network, or the connection may be made to an external computer via most third party supported networks (for example, through the Internet using an Internet Service Provider).

The invention is described in part above with reference to flowchart illustrations and/or block diagrams of methods, systems, computer program products, and data structures according to embodiments of the invention. It will be understood that each block of the illustrations, and combinations of blocks, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function/act specified in the block or blocks.

Note that computer program instructions and other processor-readable media discussed herein may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions/acts specified in the block or blocks.

Based on the foregoing, it can be appreciated that a number of embodiments, preferred and alternative, are disclosed herein. For example, in one embodiment, a method can be implemented, which includes selecting a region within a particular frame of video content to access additional information about an area of interest associated with the region within the particular frame; and displaying the additional information, in response to selecting the region associated with the particular frame of video content to access the additional information about the area of interest associated with the region within the particular frame. In another embodiment, a step can be provided for generating a selection packet that includes frame selection data associated with the particular frame of video content. In yet another embodiment, the frame selection data can include data that is sufficient to identify the particular frame of video content.

In another embodiment, a step can be provided for detecting the region. In another embodiment, a step can be implemented for accessing the additional information from an annotated content stream. In still another embodiment, a step can be provided for storing the additional information in an annotation database. In another embodiment, the region within the particular frame of the video content can be a coordinate. In another embodiment, the region within the particular frame of the video content can be a plurality of coordinates. In still another embodiment, the region within the particular frame of the video content can include data indicative of a particular region.

In another embodiment, a system can be implemented, which includes, for example, a computer-usable medium embodying computer code. Such computer program code can include instructions executable by the processor and configured for selecting a region within a particular frame of video content to access additional information about an area of interest associated with the region within the particular frame; and displaying the additional information, in response to selecting the region associated with the particular frame of video content to access the additional information about the area of interest associated with the region within the particular frame.

In another embodiment, such instructions can be configured for generating a selection packet that includes frame selection data associated with the particular frame of video content. In still another embodiment, the aforementioned frame selection data can include data that is sufficient to identify the particular frame of video content. In still other embodiments, such instructions can be configured for detecting the region. In yet other embodiments, such instructions can be configured for accessing the additional information from an annotated content stream. In still other embodiments, such instructions can be further configured for storing (initially or at other times) the additional information in an annotation database.

In yet another embodiment, a processor-readable medium storing code representing instructions to cause a processor to perform a process can be provided. Such code can comprise code to, for example, select a region within a particular frame of video content to access additional information about an area of interest associated with the region within the particular frame; and display the additional information, in response to selecting the region associated with the particular frame of video content to access the additional information about the area of interest associated with the region within the particular frame. In still another embodiment, such code can further comprise code to generate a selection packet that includes frame selection data associated with the particular frame of video content. In yet another embodiment, the frame selection data can include data that is sufficient to identify the particular frame of video content. In still another embodiment, such code can include code to detect the region. In still another embodiment, such code can comprise code to access the additional information from an annotated content stream and store the additional information in an annotation database.

It will be appreciated that variations of the above-disclosed and other features and functions, or alternatives thereof, may be desirably combined into many other different systems or applications. Also, various presently unforeseen or unanticipated alternatives, modifications, variations, or improvements therein may be subsequently made by those skilled in the art, which are also intended to be encompassed by the following claims.

1. A method, comprising: selecting a region within a particular frame of video content to access additional information about an area of interest associated with said region within said particular frame; and displaying said additional information, in response to selecting said region associated with said particular frame of video content to access said additional information about said area of interest associated with said region within said particular frame.
 2. The method of claim 1 further comprising generating a selection packet that includes frame selection data associated with said particular frame of video content.
 3. The method of claim 2 wherein said frame selection data comprises data that is sufficient to identify said particular frame of video content.
 4. The method of claim 1 further comprising detecting said region.
 5. The method of claim 1 further comprising accessing said additional information from an annotated content stream.
 6. The method of claim 1 further comprising initially storing said additional information in an annotation database.
 7. The method of claim 1 wherein said region within said particular frame of said video content comprises a coordinate.
 8. The method of claim 1 wherein said region within said particular frame of said video content comprises a plurality of coordinates.
 9. The method of claim 1 wherein said region within said particular frame of said video content comprises data indicative of a particular region.
 10. A system, comprising: a processor; and a computer-usable medium embodying computer code, said computer program code comprising instructions executable by said processor and configured for: selecting a region within a particular frame of video content to access additional information about an area of interest associated with said region within said particular frame; and displaying said additional information, in response to selecting said region associated with said particular frame of video content to access said additional information about said area of interest associated with said region within said particular frame.
 11. The system of claim 10 wherein said instructions are further configured for generating a selection packet that includes frame selection data associated with said particular frame of video content.
 12. The system of claim 11 wherein said frame selection data comprises data that is sufficient to identify said particular frame of video content.
 13. The system of claim 10 wherein said instructions are further configured for detecting said region.
 14. The system of claim 10 wherein said instructions are further configured for accessing said additional information from an annotated content stream.
 15. The system of claim 10 wherein said instructions are further configured for initially storing said additional information in an annotation database.
 16. A processor-readable medium storing code representing instructions to cause a processor to perform a process, said code comprising code to: select a region within a particular frame of video content to access additional information about an area of interest associated with said region within said particular frame; and display said additional information, in response to selecting said region associated with said particular frame of video content to access said additional information about said area of interest associated with said region within said particular frame.
 17. The processor-readable medium of claim 16 wherein said code further comprises code to generate a selection packet that includes frame selection data associated with said particular frame of video content.
 18. The processor-readable medium of claim 17 wherein said frame selection data comprises data that is sufficient to identify said particular frame of video content.
 19. The processor-readable medium of claim 16 wherein said code further comprises code to detect said region.
 20. The processor-readable medium of claim 16 wherein said code further comprises code to: access said additional information from an annotated content stream; and store said additional information in an annotation database. 